CERN-PH-TH/2010-120 

Measuring the W-Boson mass at a hadron collider: 
a study of phase-space singularity methods 

A. De Rujula

Instituto de Fisica Teorica (UAM/CSIC), Univ. Autonoma de Madrid, Madrid, and CIEMAT, Madrid, Spain,
Physics Dept., Boston University, Boston, MA 02215,
Physics Department, CERN, CH 1211 Geneva 23, Switzerland

A. Galindo

Departamento de Fisica, Universidad Complutense, Madrid, Spain, and CIEMAT, Madrid, Spain

(Dated: January 26, 2013) 

The traditional method to measure the W boson mass at a hadron collider (more precisely, its
ratio to the Z boson mass) utilizes the distributions of three variables in events where the W de- 
cays into an electron or a muon: the charged lepton transverse momentum, the missing transverse 
energy and the transverse mass of the lepton pair. We study the putative advantages of the addi- 
tional measurement of a fourth variable: an improved phase space singularity mass. This variable is 
statistically optimal, and simultaneously exploits the longitudinal- and transverse-momentum dis- 
tributions of the charged lepton. Though the process we discuss is one of the simplest realistic ones 
involving just one unobservable particle, it is fairly nontrivial and constitutes a good "training" 
example for the scrutiny of phenomena involving invisible objects. Our graphical analysis of the 
phase space is akin to that of a Dalitz plot, extended to such processes. 



PACS numbers: 

I. PROLEGOMENA 

Neutrinos -and perhaps novel weakly-interacting 
particles- escape unobserved from the collisions in which 
they are produced. In the corresponding "missing en- 
ergy" events, the reconstruction of the masses of the 
parent particles and the specification of the underlying 
process are challenging because there are typically fewer 
kinematical constraints than unknowns. At a hadron col- 
lider this situation is rendered even thornier, since par- 
ticles produced at small angles also escape undetected. 
This prohibits the determination of the longitudinal mo- 
mentum of the center of mass system of the colliding 
partons. 

The above limitations confer a higher standing to ob- 
servables exclusively dependent on transverse momenta 
[1], or otherwise invariant under longitudinal boosts [2].
In principle, transverse observables are insensitive to the 
significant uncertainties associated with the (longitudi- 
nal) parton distribution functions (pdfs). In practice the 
uncertainties are to some extent reintroduced via the an- 
gular coverage limitations of an actual experiment, which 
are not invariant under longitudinal boosts. 

The quintessential transverse observable is the transverse mass, of W-discovery fame. In an event at a hadron
collider, consider the production of a single W, followed by its decay $W \to \ell\nu$, with $\ell$ an electron,
a muon, or one of their antiparticles. Denote by $x = (x_0, \vec x_T, x_3)$ and $l = (l_0, \vec l_T, l_3)$ the
neutrino and charged lepton four-momenta, respectively. Here $\vec l_T = (l_1, l_2)$ and $\vec x_T = (x_1, x_2)$
are the momenta of the leptons in the plane transverse to the beam direction(s), and $\vec p_T = (p_1, p_2)$ the
analogous quantity for the observed final state hadrons. The traditional "transverse mass", a function of
$\vec l_T$ and $\vec p_T$, whose distribution is used to infer the W boson mass, is [1]

$$M_T^2 \equiv 2\,|\vec l_T|\,|\vec x_T|\,[1 - \cos\Delta\phi(\vec x_T, \vec l_T)], \qquad \vec x_T = -\vec l_T - \vec p_T, \tag{1}$$

where $\Delta\phi(\vec x_T, \vec l_T)$ is the angle between the transverse lepton directions. The most precise
determination of the mass of the W by a single experiment is the one by D0 [3]. In spite of the relatively
unfavorable environment of a hadron collider, its large statistics results in a value with an overall error
smaller than that of the LEP experiments. The D0 result is based on the decays $W \to e\nu$ and the measurement
of three highly correlated transverse observables: the traditional "transverse mass" function [1], the lepton's
transverse energy and the total missing transverse energy. The result:

$$M_W = 80.401 \pm 0.043\ \mathrm{GeV}, \tag{2}$$

stems from an actual measurement of $M_W/M_Z$. But $M_Z$ was determined with exquisite precision at LEP. The
PDG quotes $M_Z = 91.1876 \pm 0.0021$ GeV [4].
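
As an illustration of the transverse mass of Eq. (1), the following minimal Python sketch evaluates it from the lepton and hadron transverse momenta. It assumes exact transverse momentum balance, so that the neutrino transverse momentum is minus the sum of the lepton and hadron ones; the function name `transverse_mass` and the numerical inputs are illustrative choices, not taken from any experiment's software.

```python
import math

def transverse_mass(l_T, p_T):
    """Transverse mass of Eq. (1) from lepton and hadron transverse momenta.

    l_T, p_T: 2-tuples (GeV). The neutrino transverse momentum is inferred
    from transverse momentum balance, x_T = -(l_T + p_T) (an assumption).
    """
    x_T = (-(l_T[0] + p_T[0]), -(l_T[1] + p_T[1]))
    l_abs = math.hypot(*l_T)
    x_abs = math.hypot(*x_T)
    # |l||x|(1 - cos dphi) = |l||x| - l.x, which avoids computing the angle.
    dot = l_T[0] * x_T[0] + l_T[1] * x_T[1]
    return math.sqrt(max(0.0, 2.0 * (l_abs * x_abs - dot)))

# With no hadronic recoil the transverse mass is twice the lepton |l_T|.
m_t_no_recoil = transverse_mass((40.0, 0.0), (0.0, 0.0))
# A small, made-up recoil changes the value only slightly.
m_t_recoil = transverse_mass((40.0, 0.0), (3.0, -2.0))
```

Writing the angular factor as a difference of a product of moduli and a scalar product is a design choice that sidesteps the evaluation of the azimuthal angle.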

The procedure to extract Mw from the distributions 
in transverse mass, lepton momentum and total missing 
energy is as follows. A finely spaced set of input W bo- 
son masses, M, is used to generate a set of "templates" : 
the "Monte Carlo" (MC) expectations for the observed 
distributions, with all their experimental cuts, estimated 
uncertainties, calorimeter responses, etc. The $\chi^2(M)$ values for the comparison of data and expectations
are fit to a quadratic form, from whose minimum and width $M_W$ and its estimated error are inferred.
Naturally, the whole procedure is tested and calibrated with the observed Z production and leptonic decay
(into $e^+e^-$, in the D0 case).
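
The last step of the template procedure, the quadratic fit to the $\chi^2$ values, can be sketched as follows. This is only an illustration under stated assumptions: the $\chi^2$ values are synthetic (generated from an exact parabola, not from data or templates), the error is taken as the half-width at which $\chi^2$ rises by one unit, and the helper names `det3`, `fit_parabola` and `mass_from_chi2` are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_parabola(xs, ys):
    """Least-squares fit of y = a*t^2 + b*t + c, with t = x - mean(x).

    Returns (a, b, c, x0) where x0 is the mean of xs.
    """
    x0 = sum(xs) / len(xs)
    ts = [x - x0 for x in xs]                  # centering improves conditioning
    s = [sum(t ** k for t in ts) for k in range(5)]
    r = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    A = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [r[2], r[1], r[0]]
    d = det3(A)
    coeffs = []
    for j in range(3):                         # Cramer's rule, column by column
        Aj = [row[:] for row in A]
        for i in range(3):
            Aj[i][j] = rhs[i]
        coeffs.append(det3(Aj) / d)
    a, b, c = coeffs
    return a, b, c, x0

def mass_from_chi2(masses, chi2s):
    """Best mass and its error from the minimum and width of the parabola."""
    a, b, c, x0 = fit_parabola(masses, chi2s)
    m_best = x0 - b / (2.0 * a)
    m_err = 1.0 / math.sqrt(a)                 # chi2 rises by 1 at m_best +- m_err
    return m_best, m_err

# Synthetic chi2 curve on a finely spaced grid of trial masses (GeV).
trial_masses = [80.30 + 0.01 * i for i in range(21)]
chi2_values = [((m - 80.40) / 0.043) ** 2 + 12.0 for m in trial_masses]
m_best, m_err = mass_from_chi2(trial_masses, chi2_values)
```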

In order of decreasing incidence on the error in Eq. (2),
the limitations are the electron's energy calibration, the
uncertainties on the pdfs, and the statistics. For this 
particular measurement, the backgrounds are well un- 
derstood and quite negligible. 

Given the large statistics already gathered at the Teva- 
tron collider, and with the advent of the LHC as a high 
statistics precision physics tool, the main limitation of 
a hadron collider determination of the W mass from its 
decays into electrons and muons is likely to be the pdf 
uncertainty. At the LHC this problem is in particular exacerbated [5] by the fact that it is a $pp$, not a
$p\bar p$, collider, and the quark pdfs in a proton (or the identical antiquark pdfs in an antiproton) are much
better known than the antiquark pdfs in a proton.



II. INTRODUCTION 

A ginormous amount of attention has been paid 
to hypothetical processes involving neutral, long-lived, 
weakly-interacting final state particles that can only be 
indirectly detected. A prototypical example is the pair 
production of squarks followed by their decays into quark 
plus neutralino. Such processes generally involve two or 
more particles of unknown masses. 

The first aim in the missing particle searches for 
physics beyond the Standard Model is the establishment 
or the exclusion of a signal, both tantamount to an ef- 
ficient suppression of backgrounds. Some novel longitu- 
dinal boost invariant variables are a very good choice in 
this endeavor [2], as demonstrated by the data analysis
in [6].

A longer range aim is the measurement of unknown 
masses, when there are more than one and a candidate 
process is selected. In this connection, a very general 
algebraic singularity method has been advocated [7], in- 
volving the use of a "singularity variable" (SV), allegedly 
more powerful than that of a singularity "condition" 
(SC), such as the one leading, as we shall see, to the
result of Eq. (14).

It is too late to discover the W, though not to at- 
tempt to measure its mass even better, a relevant task 
in checking the consistency of the Standard Model and 
constraining the mass of its hypothetical scalar. With 
this ab-initio motivation, we have exhaustively studied 
the phase space for W production and leptonic decay, a 
simple undertaking analogous to the analysis of a Dalitz 
plot, but with incomplete kinematical information (§IV).

We have also studied the singularities of this phase space, and their use in constraining the W mass (§IV
and §V). We identify the criterion for the theoretically optimal SV and derive its explicit form (§VI, §VIII
and §X). En passant, we find that other nonoptimal SVs, such as the one proposed in [7], are "dangerous", in
that their distributions display fake singularities (§VII).

The singularity variables we study involve the measured longitudinal momentum of the charged lepton, $l_3$.
This longitudinal information is obviously additive to the transverse information exploited in observables
such as $M_T^2$, but is highly correlated with it (§IX). The $l_3$ distribution directly reflects the pdfs of
merging quarks and antiquarks of different flavor. Recent progress in QCD
fits and in calculations well beyond the leading order al- 
lows one to hope that -eventually- the dominant limi- 
tations concerning the problem at hand will not be the 
theoretical pdf uncertainties, but the limited calorimetric 
resolutions. 

Given a trustable set of pdfs, one can simulate the observable distribution of events
$dN/(dl_3\, d^2l_T\, d^2p_T)$ for a set of input trial masses and contrast it with observation.
This comparison involves the five relevant variables and 
their correlations; it has no statistically superior com- 
petitor. Why then study any alternatives? Besides the 
pleasure of understanding with use of one's own neural 
network, there is the motivation of paving the way of 
searches for other processes involving unobservable par- 
ticles, for which it is a-priori prohibitive to simulate all 
possibilities. 

In this note we report on a thorough theoretical study 
of the extraction of phase space information from single- 
W signal events, but we use the standard model of W 
production and decay only to leading order. We entirely 
ignore the backgrounds, which are well known to be very 
modest for this particular process. A reason for these 
choices is that only the experimentalists themselves can 
fully model the detector's effects and backgrounds, and 
that this modeling is independent from the theoretical 
issues on which we focus. 



III. LINGUISTIC QUANDARIES 

Based on equations such as $M^2 = (l + x)^2$, we shall be drawn to give a plethora of meanings to what is, for
starters, simply a letter: "M". It ends up being everything else. The resemblance to M-theory is coincidental.

Naturally, M may stand for the physical or measured $M_W$, as well as for its Lorentzian distribution, when the
width is not neglected. But it may also, as in the case of the transverse mass, $M_T$, be a non-Lorentzian
function of other observables.

In analyzing data, one compares them with MC generated distributions that depend on an ensemble of input
"trial masses", for which we reserve the label M. A different type of trial masses, which we call
$\mathcal{M}$, appears in "singularity variables", which are functions of observable momenta and of
$\mathcal{M}$. Not to make this complex linguistic heritage hereditary, we label the singularity variables
"$\Sigma$" (and not once more "M", as in the $M_T$ function), thereby not introducing new meanings to the
symbol M or the word "mass".



IV. SINGLE-W PHASE SPACE

The full information relevant to the reconstruction of the W mass is embedded in the kinematical equations:

$$E_1 \equiv x_0^2 - \vec x_T^{\,2} - x_3^2 = 0, \tag{3}$$
$$E_2 \equiv M^2 - 2\,(l_0 x_0 - \vec l_T\cdot\vec x_T - l_3 x_3) = 0, \tag{4}$$
$$E_3 \equiv l_1 + x_1 + p_1 = 0, \tag{5}$$
$$E_4 \equiv l_2 + x_2 + p_2 = 0, \tag{6}$$

where we have made the approximation $l^2 = 0$ for the charged lepton. The equations are incomplete in that
the $\nu$ longitudinal momentum, $x_3$, is unconstrained, precluding a direct determination of the W boson mass
from a "mass peak". Is there a systematic way to extract the kinematically most stringent information on $M_W$?

To answer this question it is useful to study first the phase space described by Eqs. (3)-(6) in a simplified
case. If the energy and transverse momentum of the observed hadrons could be measured with precision, it would
be possible to boost every event to the $\vec p_T = 0$ frame. To (temporarily) simplify the algebra, let us just
adopt this constraint. Solve the linear equations $E_2$, $E_3$, $E_4$ to express $x_0$, $x_1$, $x_2$ as functions
of $x_3$. Substitute the result in $E_1$ to obtain the phase space



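$$\Phi(l_T, l_3, x_3, M) \equiv l_0^2\,(l_T^2 + x_3^2) - \left(\tfrac{1}{2}M^2 - l_T^2 + l_3 x_3\right)^2 = 0, \tag{7}$$
$$l_0^2 = l_T^2 + l_3^2, \tag{8}$$
$$l_T^2 = l_1^2 + l_2^2. \tag{9}$$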


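It will be useful to consider the two solutions to Eq. (7) in $x_3 = x_3^\pm(l_T, l_3, M)$:

$$x_3^\pm = \frac{(M^2 - 2\,l_T^2)\,l_3 \pm M\,l_0\,\sqrt{M^2 - 4\,l_T^2}}{2\,l_T^2}. \tag{10}$$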



With no loss of generality, and to be able to plot the phase space, do three more things. Take $l_3$ to be
positive if directed along the direction of a given (fixed) proton beam. Define the $l_T$ of Eq. (9) to be
positive if directed above the beams, negative otherwise. The function $\Phi(l_T, l_3, x_3) = 0$, from divers
points of view, is plotted in Fig. 1. Along the (blue) straight lines the planes tangent to the phase space
contain one "visible" direction, $l_3$, and the "invisible" direction $x_3$. The projection of phase space into
the visible directions $(l_T, l_3)$ is bounded by the lines $l_T = \pm M/2$.
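
The two-fold ambiguity in the neutrino longitudinal momentum, and its disappearance at the boundary, can be checked with a short Python sketch. It is written for the $\vec p_T = 0$ case with massless leptons; the overall normalization of the phase space function `phi`, and the function names, are assumptions made here for illustration (only the zeros of `phi` matter).

```python
import math

def phi(l_T, l_3, x_3, M):
    """Phase space function for p_T = 0 (massless leptons); zero on shell."""
    l0_sq = l_T ** 2 + l_3 ** 2
    return l0_sq * (l_T ** 2 + x_3 ** 2) - (0.5 * M ** 2 - l_T ** 2 + l_3 * x_3) ** 2

def x3_roots(l_T, l_3, M):
    """The two neutrino longitudinal momenta compatible with (l_T, l_3, M)."""
    disc = M ** 2 - 4.0 * l_T ** 2
    if disc < 0.0:
        raise ValueError("|l_T| > M/2: outside the projected phase space")
    l_0 = math.hypot(l_T, l_3)
    a = (M ** 2 - 2.0 * l_T ** 2) * l_3
    b = M * l_0 * math.sqrt(disc)
    return (a + b) / (2.0 * l_T ** 2), (a - b) / (2.0 * l_T ** 2)

# Momenta in units of M. A generic point has two distinct solutions ...
roots_generic = x3_roots(0.3, 0.2, 1.0)
# ... while on the boundary l_T = M/2 they merge into the single value x_3 = l_3.
roots_boundary = x3_roots(0.5, 0.2, 1.0)
```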

The boundaries of the phase space projected along an invisible direction onto the space of the visible ones,
$l_T = \pm M/2$, are an example of singularity condition(s). At their location there is a single invisible
coordinate $x_3$ for fixed values $(l_T, l_3)$ of the visible ones, as opposed to the two of the general case
in Eq. (10), and the projected phase space density is not smooth [7].

Figure 1: Three views of the phase space function $\Phi$ of Eq. (7), with the momenta ($l_T$, $l_3$ and $x_3$)
in units of M. The black lines cut the surface at fixed $l_T$ or $l_3$ and the green ellipses at fixed
$W_3 = l_3 + x_3$, the longitudinal momentum of the W. The (blue) lines at $l_T = \pm 1/2$, $x_3 = l_3$ are
singular. A point in the $(l_T, l_3)$ plane corresponds to two values of $x_3 = x_3^\pm(l_T, l_3)$.

In practice two cuts have to be applied to the momentum of the observed lepton. We adopt $|l_3| < 5\,|l_T|$
(resulting from a pseudo-rapidity limitation $|\eta| < 2.3$) and a rather demandingly low $|l_T| > 10$ GeV.
These cuts result in the unobservability of a large fraction of phase space: the (red) domain shown without a
mesh in Fig. 2. The
maximum |x3| = O{b0) Mw happens to be close to the 
absolute kinematical limit, approximately $|x_3| < E_p$, at
the current LHC energy, $E_p = 3.5$ TeV. This was probably not
the main reason to choose this machine energy.
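
The correspondence between the pseudo-rapidity cut and the cut on the ratio of longitudinal to transverse momentum can be made explicit with a few lines of Python. This is a sketch for a massless lepton, for which $\sinh\eta = l_3/|l_T|$; the constant and function names are illustrative.

```python
import math

ETA_MAX = 2.3      # pseudo-rapidity acceptance quoted in the text
LT_MIN = 10.0      # GeV, minimum lepton transverse momentum quoted in the text

def pseudorapidity(l_T, l_3):
    """eta = asinh(l_3 / |l_T|) for a massless lepton."""
    return math.asinh(l_3 / abs(l_T))

def passes_cuts(l_T, l_3):
    """True if the lepton satisfies both the |l_T| and the |eta| cuts."""
    return abs(l_T) > LT_MIN and abs(pseudorapidity(l_T, l_3)) < ETA_MAX

# |eta| < 2.3 is equivalent to |l_3| < sinh(2.3) |l_T|, i.e. roughly 5 |l_T|.
slope = math.sinh(ETA_MAX)
```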




Figure 2: The same as Fig. 1, but in a different, more extensive, domain of $(l_T, l_3, x_3)$. The finite
dashed (green) domain is what survives the typical experimental cuts on $l_T$ and $\eta$. A (yellow) plane
tangent to the phase space surface $\Phi = 0$ along the singularity line at $l_T/M = -1/2$ is shown at the
left; it contains the invisible direction $x_3$. The arrow is orthogonal to the phase space $\Phi = 0$ at a
point in it, and extends from this point to the tangent plane.



In simple cases such as the one at hand the singularity condition can be directly obtained. The $l_T$ boundary
is the projection of the phase space points at which the tangent plane is vertical and contains the invisible
direction $x_3$. At these points $\partial\Phi(l_T, l_3, x_3)/\partial x_3 = 0$. Eliminating M from this
expression and Eq. (7) one obtains $x_3 = l_3$. At these boundaries $M^2 = 4\,l_T^2$.



A. The formal singularity condition

The procedure of the last paragraph requires some guesswork, but can be rendered entirely general and
systematic. At a singularity one or more of the invisible directions are contained in the tangent plane to the
full phase space. The general condition for this to happen is that, in the space {x} of invisible directions,
the row vectors of the Jacobian matrix $D_{ij} = \partial E_i/\partial x_j$ (with the row index i running along
the number of equations and the column index j over the number of invisible coordinates) be linearly dependent,
so that the derivative relative to an x-direction normal to these vectors be zero. In other words, at a
singularity, the rank of $D_{ij}$ must be smaller than its rank at nonsingular points [7].

For the general single-W case we are discussing

$$D \equiv \frac{\partial(E_1, E_2, E_3, E_4)}{\partial(x_0, x_1, x_2, x_3)} \tag{11}$$

and the reduced rank condition is

$$E_c \equiv \mathrm{Det}\, D \propto l_0\, x_3 - l_3\, x_0 = 0. \tag{12}$$

The same condition is obtained in the $\vec p_T = 0$ example. Combining it with Eq. (7) results in $x_3 = l_3$,
the phase space boundaries shown as straight (blue) lines in Fig. 1.



B. The function $\Sigma_T$

The general case with nonvanishing $\vec p_T$ is treated with equal ease. Eliminate the four variables x to
solve the five equations (3)-(6) and (12) in M. The result is $\Sigma_T = 0$, with:

$$\Sigma_T(M, \vec l_T, \vec p_T) = M^4 - 4 M^2\, \vec l_T\cdot(\vec l_T + \vec p_T) + 4\left[(\vec l_T\cdot\vec p_T)^2 - l_T^2\, p_T^2\right]. \tag{13}$$

Of the four M-roots of $\Sigma_T = 0$, one is not unphysical:

$$M_T(\vec l_T, \vec p_T) = \sqrt{2}\,\sqrt{|\vec l_T|\,|\vec p_T + \vec l_T| + \vec l_T\cdot(\vec l_T + \vec p_T)}, \tag{14}$$

which reduces to $M_T = 2\,|\vec l_T|$ for $\vec p_T = 0$. The function $M_T$ of Eq. (14) is the consuetudinary
one of Eq. (1).
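
The statement that the transverse mass is the only physical root of the singularity condition in M can be checked numerically. The sketch below assumes that the condition is a quadratic in $M^2$ with the coefficients written in the code (our reading of Eq. (13)); the function names and the sample momenta are illustrative.

```python
import math

def sigma_T(M, l_T, p_T):
    """Quartic in M whose zeros are the singularity conditions (2D vectors)."""
    l_dot_p = l_T[0] * p_T[0] + l_T[1] * p_T[1]
    l_sq = l_T[0] ** 2 + l_T[1] ** 2
    p_sq = p_T[0] ** 2 + p_T[1] ** 2
    return (M ** 4 - 4.0 * M ** 2 * (l_sq + l_dot_p)
            + 4.0 * (l_dot_p ** 2 - l_sq * p_sq))

def m_squared_roots(l_T, p_T):
    """Both roots in M^2 of sigma_T = 0, returned as (larger, smaller)."""
    l_sq = l_T[0] ** 2 + l_T[1] ** 2
    a = l_sq + l_T[0] * p_T[0] + l_T[1] * p_T[1]          # l_T . (l_T + p_T)
    b = math.sqrt(l_sq) * math.hypot(l_T[0] + p_T[0], l_T[1] + p_T[1])
    # a <= b by the Cauchy-Schwarz inequality, so the second root is never positive.
    return 2.0 * (a + b), 2.0 * (a - b)

l_T_example = (40.0, 0.0)      # GeV, made-up lepton transverse momentum
p_T_example = (3.0, -2.0)      # GeV, made-up hadronic recoil
m2_phys, m2_unphys = m_squared_roots(l_T_example, p_T_example)
m_T_example = math.sqrt(m2_phys)       # the physical root: the transverse mass
```

Since the smaller root in $M^2$ is never positive, the four roots in M consist of the transverse mass, its negative, and a pair that is imaginary (or zero).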



V. KIM'S SINGULARITY VARIABLE 

Discussing the general case with an arbitrary number 
of invisible final state particles, Kim has argued [7] that
the use of a "singularity variable" (SV) is more powerful 
than that of a singularity "condition" (SC), such as the 
one leading to the result of Eq. (14). 

Kim requires a SV to have four properties [7]: 

(i) To vanish at the singularity. 

(ii) To be perpendicular -at the singularity- to the phase 
space surface in the observable directions. 

(iii) To be "normalized such that every event can give 
the same significance". 

(iv) To be computed to first nontrivial order (the second
fundamental form) in the distance between a phase space
point and the nearest singularity.

Our interpretation of these formal looking choices is the following. Condition (i) is the only scale invariant
stipulation. At the singularity, condition (ii) entails a maximal sensitivity to the unknown masses. Condition
(iii) ensures that two events with the same distance to the singularity be treated on equal footing. The
requirement (iv) is one way to make the procedure general.

To fathom all this it is useful to jump momentarily to the result of Kim's prescription in our single-W case.
The SV (more precisely, the singularity function) is:



T 



211 



41^ 



(15) 



with $\Sigma_T$ as in Eq. (13), and $\mathcal{M}$ substituted for M, as its role will now be that of a trial
mass. For $\vec p_T = 0$ this SV reduces to:



7^2 (_A^2_4;2) 



(16) 



Refer for a moment to the limit $\Gamma \to 0$ for the W width and a situation with no measurement
uncertainties. Consider a set of N real or MC generated events, i.e. a list of values of $(\vec l, \vec p_T)$,
and the histograms $dN(\mathcal{M})/d\sigma$ of the corresponding values of
$\sigma = \Sigma(\mathcal{M}, \vec l, \vec p_T)$, for different choices of $\mathcal{M}$. For
$\mathcal{M} = M_W$, the real or "MC true" value of the W boson mass, the singularity is at $\sigma = 0$,
$dN(\mathcal{M})/d\sigma$ peaks at that point and vanishes for $\sigma < 0$. For a fixed data set and varying
$\mathcal{M}$, the function $dN(\mathcal{M})/d\sigma$ varies in shape, but obviously not in statistically useful
content. We shall later illustrate these points in detail.

The use of an "implicit" variable $\mathcal{M}$ may seem to be an overkill. In the single-W case with
$\vec p_T = 0$, it is. One could equally well erase $\mathcal{M}$ in Eq. (16) and use the SV:



^iiM,l) 



11+211 



II 



(17) 



which, in conjunction with $M_T^2 = 4\,l_T^2$, embodies two projections of the full distribution
$dN/(dl_T\, dl_3)$.

Contrariwise, one could make the singularity condition
into a singularity variable with an implicit $\mathcal{M}$:



(18) 



and consider the distributions $dN(\mathcal{M})/d\sigma_T$. But the information that these distributions
contain is precisely the same as that of the distribution $dN/dl_T^2$: the corresponding histograms are just
mirror reflected and shifted relative to one another.

The above unfavorable commentaries on implicit variables are by no means general. Even in the single-W case,
for $\vec p_T \neq 0$, it will not be possible to "erase" $\mathcal{M}$ from Eq. (15) in the same cavalier
spirit in which we erased it from Eq. (16) to obtain Eq. (17). Singularity variables should be of particular
practical relevance in problems with more than one unknown mass or unobservable particle, for which the labor
of making templates for all possibilities may be out of the question. There, at least at the discovery stage,
"clever" variables may be useful to zoom kinematically to the relevant mass ranges before a full analysis is to
be contemplated, as discussed
in ^. 



VI. THE QUEST FOR AN OPTIMAL 
VARIABLE 

It is instructive to consider a trivial example with one visible variable, $l$, and a single invisible one,
$x$, constrained by the "Euclidean phase space" equation

$$\Phi(x, l, M) \equiv x^2 + l^2 - M^2 = 0. \tag{19}$$



This apparently arbitrary instance actually corresponds to an imaginable process, that of a particle decaying
into an invisible one, X, and a visible one that happens to be at rest. The longitudinal momentum of X is $x$
and its transverse one, $l$, is measured via the usual transverse balance. M is a combination of the masses
involved [8].

The value of the unknown quantity M in Eq. (19) is encoded in the $l$-distribution. The Jacobian matrix is
$D = \partial\Phi/\partial x = 2x$. The constraint that its rank be reduced is $x = 0$, resulting in the SCs
$l = \pm M$. For a given "observed" $l$ there are two points P in $\Phi$. Their nearest singularity is the
point S, as illustrated in Fig. 3.




Figure 3: P is a point in "phase space" of which only the corresponding $l$ is measured. S is the closest
singularity to it. The length of the three arrows and the angle $u$ are used to construct various singularity
variables.



Following Kim's method [7], we obtain for the SV

$$\Sigma_K(\mathcal{M}, l) = u^2 \equiv \left[\arccos\frac{|l|}{\mathcal{M}}\right]^2, \tag{20}$$

proportional to the squared (angular or geodesic) P to S distance measured on the $\Phi$ surface. In a less
trivial case, the resulting SV would have been the same distance on the quadratic approximation to $\Phi$
around S.

There is nothing sacred about the elegant result of Eq. (20). There are other SVs that (up to an overall
normalization) coincide with $u^2$ to second order. Three examples, illustrated in Fig. 3, are:

• (1) The distance between P and the hyperplane, H, tangent to $\Phi$ at S (the dotted vertical line, in
this case). This distance is the horizontal arrow.

• (2) The P to H distance along the normal direction to $\Phi$ at P: the slanted arrow.

• (3) The square of the length of the vertical arrow.


In the notation of Eq. (20) and normalized so that they coincide with $\Sigma_K$ to $O(u^2)$, these SVs are:

$$\Sigma_1(\mathcal{M}, l) = 2\,[1 - \cos u], \tag{21}$$
$$\Sigma_2(\mathcal{M}, l) = 2\,[1/\cos u - 1], \tag{22}$$
$$\Sigma_3(\mathcal{M}, l) = \sin^2 u. \tag{23}$$

Note that $\Sigma_1$ is the 2D analog of the singularity condition used as a SV, as in Eq. (18). That is to
say, it is equivalent to the transverse mass distribution.
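
That the four toy singularity variables vanish at the singularity and agree with one another to leading order near it can be verified directly. The Python sketch below uses $\cos u = |l|/\mathcal{M}$ for a point on the circle; the function name `toy_svs` and the dictionary keys are illustrative labels.

```python
import math

def toy_svs(l, M_trial):
    """The four toy singularity variables, valid for 0 < |l| <= M_trial."""
    c = abs(l) / M_trial          # cos u
    u = math.acos(c)
    return {
        "K": u ** 2,                      # squared geodesic distance
        "1": 2.0 * (1.0 - c),             # horizontal arrow
        "2": 2.0 * (1.0 / c - 1.0),       # arrow along the normal at P
        "3": 1.0 - c ** 2,                # sin^2 u, vertical arrow squared
    }

at_singularity = toy_svs(1.0, 1.0)
near_singularity = toy_svs(0.999, 1.0)
# Ratio of the largest to the smallest value: close to 1 near the singularity.
spread = max(near_singularity.values()) / min(near_singularity.values())
```

Away from the singularity the four variables differ at the next order in $u^2$, which is what distinguishes their statistical power.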



Is any of these SVs in Eqs. (20) to (23) "the best" in some useful sense? To answer, consider the distributions
of the numerical values $\sigma$ of the various functions, for fixed M (a zero width resonance):

$$\mathcal{H}_i(\sigma, M, \mathcal{M}) \equiv \frac{dN}{d\sigma} = \int dx\, dl\; \delta(x^2 + l^2 - M^2)\; \delta[\sigma - \Sigma_i(\mathcal{M}, l)]. \tag{24}$$

Recalling Eq. (19), and in particle physics language, $dx\, dl\, \delta(\Phi)$ is the phase space.
$\mathcal{H}_i$ is the distribution of the $\Sigma_i$ values. Monte Carlo generated "diagonal" histograms,
$\mathcal{H}_i(\sigma, M, \mathcal{M} = M)$, would be the templates for various trial choices of M.



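In the four cases of Eqs. (20) to (23), with the notation $\rho \equiv \mathcal{M}/M$, and normalized to unit
integral in the allowed range of the corresponding $\sigma$, the distributions are

$$\mathcal{H}_K = \frac{\rho\,\sin\sqrt{\sigma}}{\pi\,\sqrt{\sigma}\,\sqrt{1 - \rho^2\cos^2\sqrt{\sigma}}}, \qquad \sigma \in [\arccos^2(\rho^{-1}),\ \pi^2/4]$$
$$\mathcal{H}_1 = \frac{\rho}{\pi\,\sqrt{1 - \rho^2 + \rho^2\,(\sigma - \sigma^2/4)}}, \qquad \sigma \in [2\,(1 - \rho^{-1}),\ 2]$$
$$\mathcal{H}_2 = \frac{4\,\rho}{\pi\,(2 + \sigma)\,\sqrt{(2 + \sigma)^2 - 4\,\rho^2}}, \qquad \sigma \in [2\,(\rho - 1),\ \infty)$$
$$\mathcal{H}_3 = \frac{\rho}{\pi\,\sqrt{1 - \rho^2\,(1 - \sigma)}\,\sqrt{1 - \sigma}}, \qquad \sigma \in [1 - \rho^{-2},\ 1] \tag{25}$$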



In the simple case at hand, one need not refer to "nondiagonal" histograms
$\mathcal{H}_i(\sigma, M, \mathcal{M})$, that involve the implicit variable $\mathcal{M} \neq M$. In more blind
searches with several unknown masses this may no longer be the case. Moreover the nondiagonal histograms
provide one way to ascertain the "goodness" of their SV.

To quantify the amount by which the distribution of a given SV is sensitive to the difference between a "true"
mass $\mathcal{M} = M$ and a variation thereof, $\mathcal{M} = M + \Delta M$, define the "statistical squared
derivative", $\chi_i^2$, and its integral [9]



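$$\chi_i^2(\sigma) \equiv \frac{1}{\mathcal{H}_i(\sigma, M, M)} \left|\frac{\partial \mathcal{H}_i(\sigma, M, \mathcal{M})}{\partial \mathcal{M}}\right|^2_{\mathcal{M}=M}, \qquad D_i \equiv \int \chi_i^2(\sigma)\, d\sigma. \tag{26}$$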



The notation reflects the parentage of $\chi_i^2$ with the usual $\chi^2$ measure; it is also the square of the
geometrical mean between ordinary and logarithmic derivatives. "Statistical" reflects the fact that
$\chi_i^2(\sigma)$ is a local measure of a variation relative to the one expected from a standard deviation of
$1\sigma$ size. In this hypothetical case with sharply defined cuts in $\sigma$, $\chi_i^2$ is singular at
$\sigma = 0$. Regularizing the singularity with a cut $\sigma > \sigma_0 > 0$, we obtain:



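$$M^2 D_K \underset{\sigma_0 \to 0}{\longrightarrow} \frac{2}{3\pi}\,\sigma_0^{-3/2}\,(1 + 2\,\sigma_0) + O(1),$$
$$M^2 D_1 \underset{\sigma_0 \to 0}{\longrightarrow} \frac{2}{3\pi}\,\sigma_0^{-3/2}\,\left(1 + \tfrac{15}{8}\,\sigma_0\right) + O(1),$$
$$M^2 D_2 \underset{\sigma_0 \to 0}{\longrightarrow} \frac{2}{3\pi}\,\sigma_0^{-3/2}\,\left(1 + \tfrac{21}{8}\,\sigma_0\right) + O(1),$$
$$M^2 D_3 \underset{\sigma_0 \to 0}{\longrightarrow} \frac{2}{3\pi}\,\sigma_0^{-3/2}\,\left(1 + \tfrac{3}{2}\,\sigma_0\right) + O(1). \tag{27}$$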



The singularities of the different $\mathcal{H}_i$ are all $\propto 1/\sqrt{\sigma}$ and have been equally
normalized by construction (and for a fair comparison). The sensitivity to the value of $\mathcal{M}$ is
maximal close to the singularity. This sensitivity puts the SVs of Eqs. (20) to (23) in the "goodness" order

$$\Sigma_2 \succ \Sigma_K \succ \Sigma_1 \succ \Sigma_3 \tag{28}$$

dictated by the second term in brackets in Eqs. (27). The fully "orthogonal" SV $\Sigma_2$ is the contest's
winner. The usual transverse mass distribution ($\Sigma_1$ in this simplification) does not fare well.
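
The ordering of the toy variables can be cross-checked by comparing their statistical squared derivatives at a small value of $\sigma$. In the sketch below the closed forms of $\chi_i^2(\sigma)$ are worked out here, as an assumption, by differentiating the zero-width toy distributions (unit-normalized, in units of M = 1) with respect to the trial mass at $\mathcal{M} = M$; they are not quoted from the source, and the function name `chi2_local` is illustrative.

```python
import math

def chi2_local(name, sigma):
    """Statistical squared derivative of the toy SV `name` at M_trial = M = 1."""
    if name == "K":
        s = math.sqrt(sigma)
        return 1.0 / (math.pi * s * math.sin(s) ** 4)
    if name == "1":
        return (sigma - sigma ** 2 / 4.0) ** -2.5 / math.pi
    if name == "2":
        return 4.0 * (2.0 + sigma) ** 3 * (sigma * (4.0 + sigma)) ** -2.5 / math.pi
    if name == "3":
        return sigma ** -2.5 * (1.0 - sigma) ** -0.5 / math.pi
    raise ValueError(name)

# All four behave as sigma^(-5/2)/pi at small sigma; the ranking is decided
# by the first correction, so compare them at a small but finite sigma.
sigma_probe = 0.05
ranking = sorted(["K", "1", "2", "3"],
                 key=lambda n: chi2_local(n, sigma_probe), reverse=True)
# Leading behaviour factored out: each ratio tends to 1 as sigma -> 0.
ratios = {n: chi2_local(n, sigma_probe) * math.pi * sigma_probe ** 2.5
          for n in ["K", "1", "2", "3"]}
```

Under these assumptions the variable built along the normal direction comes first and the squared vertical distance last, in line with the ordering of Eq. (28).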

So far there seems to be no compelling reason not to have made the above variable-comparing analysis with
$\mathcal{M} = M$ for starters. But in a more realistic case M would stand for the central value of a
distribution of nonzero natural width, while $\mathcal{M}$ is just an auxiliary quantity introduced for
analysis purposes.

To illustrate the above, and to convey the numerical meaning of Eqs. (27), substitute the sharp definition of M
in Eqs. (19, 24) by the one corresponding to a resonance of mass M and width $\Gamma$:



$$\delta(l^2 + x^2 - M^2) \;\to\; \frac{1}{\pi}\,\frac{M\,\Gamma}{(l^2 + x^2 - M^2)^2 + M^2\,\Gamma^2}. \tag{29}$$

This corresponds to "spreading" the circle of Fig. 3 and "scanning" it with circles of varying (but sharply
defined) $\mathcal{M}$, with the help of different "$\Sigma$" scanners.

Results for the distributions for Kim's variable and the orthogonal SV are shown in the upper Fig. 4. The
lower figure shows their $\chi_i^2(\sigma)$ around the $\sigma = 0$ singular point, the domain to which the
$\mathcal{H}_i$ distributions are most sensitive to the unknown $\mathcal{M}$. The figures are drawn for
$M = \mathcal{M} = 1$, $\Gamma = 0.3$, showing how the orthogonal $\Sigma_2$ is better than $\Sigma_K$. However,
the difference is not large and, for a narrow resonance (or one whose width is masked by detector effects) it
would be negligible, as the relative differences close to $\sigma = 0$ between the $\chi_i^2(\sigma)$ of the
various SVs diminish linearly as $\Gamma/M \to 0$.



Figure 4: Top: the $d\mathcal{H}_i(\sigma, M, \Gamma, \mathcal{M})/d\sigma$ distributions for the SVs
$i = K, 2$, for $M = \mathcal{M} = 1$, $\Gamma = 0.3$. Bottom: the corresponding statistical squared derivatives.



The $D_i$ integrals of Eq. (26) over their complete respective kinematical domains are numerically similar,
apparently demonstrating that, in toto, all variables are statistically equivalent. In practice this is not the
case. The signal-to-noise ratios of the distributions are increasingly unfavorable as one moves away from the
$\sigma \sim 0$ neighborhood of the signal's peak.

We have proven that $\Sigma_2$ is better than others, but not that it is the best. Its optimality, however,
appears to be intuitively obvious. The phase space $\Phi$ of Eq. (19) simply scales as M changes. The optimal
SV ought to maximize the dependence on M at every point in phase space. This dependence is maximal in the
direction orthogonal to $\Phi$. The variable $\Sigma_2$ measures a distance to the nearest singularity, in that
preferred direction.



VII. INDUCED SINGULARITIES 

Let us return to the case of single-W production and model the simplified $\vec p_T = 0$ instance as stated in
the ending paragraph of §II, that is, to leading order. We use the quark and antiquark parton distribution
functions of [10] at an LHC energy of $\sqrt{s} = 7$ TeV and apply the cuts $|l_T| > 10$ GeV and
$|\eta| < 2.3$ to the charged lepton. We ignore the difference between $W^+$ and $W^-$ production.

We choose to present results for the distribution of the values, $\sigma$, of the function:



(30) 



which differs from Eq. (16) by a factor $4\,l_T^2$. This does not affect the arguments to follow. Moreover, in
conjunction with the transverse mass ($4\,l_T^2$) distribution, the use of Eq. (16) or of Eq. (30) is
equivalent.

A heedless use of Eq. (30) results in an interesting surprise, illustrated in the top panel of Fig. 5. The
histogram has two peaks, one of them significantly above the expected singularity at $\sigma = 0$. The peaks
fuse as one lets the W have its rather narrow width, $\Gamma/M \sim 0.02$, as illustrated in the lower panel of
Fig. 5. Still, the fused peak is not just the expected singularity at the origin of the SV and the issue calls
for understanding.

Consider restricting the phase space of Eq. (7) and Fig. 1 to its slices at fixed longitudinal momentum of the
W, $W_3 = x_3 + l_3$, shown in these plots as (green) ellipses (in practice this can only be done at a
monochromatic $e^+e^-$ collider). The distribution $\mathcal{H}(\sigma, M, \mathcal{M}, W_3)$ is shown on the
upper Fig. 6 for $M = \mathcal{M} = 1$, $W_3 = 2$. It has two singularities besides the one expected at
$\sigma = 0$.

The origin of the singularities is clarified in the lower Fig. 6, where the curve is the phase space
$\Phi(l_3, \sigma)$, again for $M = 1$, $W_3 = 2$. A uniform distribution of events along $\Phi(l_3, \sigma)$,
projected on the $\sigma$ axis, has three cumulation points at the projections of the vertical tangents. The
one at the edge is the expected $\sigma = 0$ singularity, the other two are induced singularities. In these
$M_W = 1$ units, for $W_3 < 1$ there is no induced singularity, for $W_3 = 1$ there is one and for $W_3 > 1$
there are two. One induced singularity survives the integration over the $W_3$ distribution, as shown in Fig. 5.

The source of the induced singularities is the specific form of the SV in Eq. (30) (or of the formal SV of
Eq. (16)), which results in a fixed-$W_3$ phase space the curvature of whose surface is not everywhere of the
same sign. The induced singularities are not endpoints, but are event accumulation points for the same reason
as the endpoints, i.e. the tangent manifold to the phase space at their locations contains invisible directions.

Figure 5: Top: The singularity variable of Eq. (17) results, for a narrow resonance, in a distribution with an
extra singularity away from $\sigma = 0$. Bottom: The small width of the W suffices to merge the singularities,
shifting the resulting peak away from $\sigma = 0$.

In a process with just one mass scale to disentangle, the complications we just discussed are a lesser problem.
In a process with more than one mass scale, they are a putative source of confusion. The fully orthogonal SV
$\Sigma_2$ of Eq. (22) does not result in induced singularities.

Figure 6: Top: The phase space of Eq. (7) and Fig. 1 for a fixed $W_3 = x_3 + l_3 = 2$ results, for a narrow
resonance, in a triple peaked distribution (all quantities in units of M = 1). The singularities occur at
values of $\sigma$ where the phase space $\Phi(l_3, \sigma)$ has vertical $l_3$ projections.



VIII. RESULTS

For the single-W case at hand, consider the "fully orthogonal" variable akin to $\Sigma_2$ in Eq. (22). We call
it $\Sigma_A$ and discuss it first in the $\vec p_T = 0$ instance. Its geometrical interpretation is depicted in
Fig. 2; $\Sigma_A$ is a measure of the length of the arrow, which is orthogonal to the phase space at a point P
with coordinates $(l_T, l_3, x_3)$ and ends in the plane tangent to the phase space surface at the singularity
line.

Define the unit vector n orthogonal to the surface $\Phi(l_T, l_3, x_3, \mathcal{M})$ of Eq. (7):

$$N = (N_1, N_2, N_3) = (\partial\Phi/\partial l_T,\ \partial\Phi/\partial l_3,\ \partial\Phi/\partial x_3), \qquad n = N/|N|. \tag{31}$$

The length, $\Sigma_A$, of the orthogonal segment joining P with a point in the plane tangent to the
singularity is such that

$$\frac{\mathcal{M}}{2} = l_T + \Sigma_A\, n_1. \tag{32}$$

Figure 7: Top: Histogram $\mathcal{H}_T$ of the distribution of the square of the transverse mass, for M = 1.
Center: Histogram $\mathcal{H}_2$ of the distribution of the values $\sigma_2$ of the optimal SV $\Sigma_A$ of
Eq. (33), for $M = \mathcal{M} = 1$. Bottom: same as center, for $\mathcal{M} \neq M$.



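More explicitly

$$\Sigma_A(l_T, l_3, M) = \frac{M/2 - l_T}{2\, l_T\, (M^2 + W_3^2)} \left[\, 4\, l_T^2 \left(M^4 + M^2 W_3^2 + W_3^4\right) + M^4 \left(2\, l_3^2 + W_3^2 - 2\, l_3 W_3\right) + 8\, W_3^2\, l_T^4 \,\right]^{1/2}, \qquad W_3 = l_3 + x_3(l_T, l_3, M) \tag{33}$$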



with $x_3$ as in Eq. (10). For each $(l_T, l_3)$ pair (an event)
there are two equal-probability solutions, the two roots
of the equation. In generating events we chose at random
the $\pm$ sign in Eq. (10).
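The construction of Eqs. (31) to (33) can be sketched numerically. What follows is a minimal illustration, not the authors' code. It assumes the $p_T = 0$ phase space function $\Phi = (M^2 - 2 l_T^2 + 2 l_3 x_3)^2 - 4 (l_3^2 + l_T^2)(l_T^2 + x_3^2)$, obtained by setting $p_T = 0$ in Eq. (36), and it solves $\Phi = 0$ for $x_3$ directly instead of quoting Eq. (10). All function names are hypothetical.

```python
import math
import random


def phi(l_t, l_3, x_3, m):
    """Phase space function at p_T = 0 (assumed form); zero for physical events."""
    a = m**2 - 2.0 * l_t**2 + 2.0 * l_3 * x_3
    return a**2 - 4.0 * (l_3**2 + l_t**2) * (l_t**2 + x_3**2)


def x3_roots(l_t, l_3, m):
    """Both roots of phi = 0 in x_3; they are real only if 2 * l_t <= m."""
    disc = (l_3**2 + l_t**2) * (m**2 - 4.0 * l_t**2)
    if disc < 0.0:
        raise ValueError("complex x_3: 4 l_T^2 > M^2")
    base = (m**2 - 2.0 * l_t**2) * l_3
    step = m * math.sqrt(disc)
    return (base + step) / (2.0 * l_t**2), (base - step) / (2.0 * l_t**2)


def gradient(l_t, l_3, x_3, m):
    """N = (dPhi/dl_T, dPhi/dl_3, dPhi/dx_3), written with W_3 = l_3 + x_3."""
    w_3 = l_3 + x_3
    n_1 = -8.0 * l_t * (m**2 + w_3**2)
    n_2 = 4.0 * x_3 * m**2 - 8.0 * l_t**2 * w_3
    n_3 = 4.0 * l_3 * m**2 - 8.0 * l_t**2 * w_3
    return n_1, n_2, n_3


def sigma_a(l_t, l_3, m, rng=random):
    """Sigma_A from M/2 = l_T - Sigma_A * n_1, picking one root at random."""
    x_3 = rng.choice(x3_roots(l_t, l_3, m))
    n_1, n_2, n_3 = gradient(l_t, l_3, x_3, m)
    norm = math.sqrt(n_1**2 + n_2**2 + n_3**2)
    return (l_t - 0.5 * m) * norm / n_1  # n_1 = N_1 / |N|
```

With $M = 1$, calling `sigma_a(0.3, 0.2, 1.0)` returns a positive number for either root, and the value goes to zero as $l_T$ approaches $M/2$.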

We show in Fig. 7 the $p_T = 0$ results for the $m_T^2$ and
$\Sigma_A$ distributions. All three graphs are generated for a
peak mass of the W, $M = 1$. As shown in the bottom
figure, for a trial mass $\mathcal{M} \neq M$ the peak of the
distribution shifts away from $\sigma_A = 0$, becoming wider and, for
$\mathcal{M} < M$, double peaked: there is for this "bad" choice an
induced singularity, even for the optimal SV. Naturally,
the histograms with $\mathcal{M} \neq M$ are not statistically
independent from the $\mathcal{M} = M$ one. While they may be used
to "focus" on the correct choice of $\mathcal{M}$, the extraction
of information on the W boson mass would ultimately
hinge on a set of templates for $\mathcal{M} = M$ values close to
its currently measured value.

The value of $x_3$ is not always real. When the Lorentzian
distribution of physical (or MC generated) values of $M_W$
produces an event with $4\, l_T^2 > \mathcal{M}^2$, $x_3$ involves
the square root of a negative number. There is nothing
pathological about these events. The way to "recover" them is to set:

$$\text{If } \mathrm{Im}(\Sigma_A) \neq 0, \text{ then } \Sigma_A \to -\mathrm{Abs}(\Sigma_A) \tag{34}$$

In the middle panel of Fig. 7, for example, the recovered events
are those at $\sigma_A < 0$.
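The prescription of Eq. (34) can be sketched as follows. This is an illustrative guess at an implementation, not the authors' procedure: it assumes that $x_3$ is obtained by solving the $p_T = 0$ condition $\Phi = 0$ with a complex square root, and that $\Sigma_A$ is evaluated as $(M/2 - l_T)\,|\mathbf{N}|/|N_1|$ written out in terms of $W_3$. The function names and the tolerance `tol` are hypothetical.

```python
import cmath


def x3_complex(l_t, l_3, m, sign=1):
    """Root of Phi = 0 at p_T = 0; complex when 4 l_T^2 > m^2."""
    disc = (l_3**2 + l_t**2) * (m**2 - 4.0 * l_t**2)
    base = (m**2 - 2.0 * l_t**2) * l_3
    return (base + sign * m * cmath.sqrt(disc)) / (2.0 * l_t**2)


def sigma_a_complex(l_t, l_3, m, sign=1):
    """Sigma_A = (m/2 - l_T) |N| / |N_1|, continued to complex x_3."""
    w_3 = l_3 + x3_complex(l_t, l_3, m, sign)
    radicand = (4.0 * l_t**2 * (m**4 + m**2 * w_3**2 + w_3**4)
                + m**4 * (2.0 * l_3**2 + w_3**2 - 2.0 * l_3 * w_3)
                + 8.0 * w_3**2 * l_t**4)
    return (0.5 * m - l_t) * cmath.sqrt(radicand) / (2.0 * l_t * (m**2 + w_3**2))


def recover(sigma, tol=1e-12):
    """Eq. (34): a value with nonzero imaginary part becomes minus its modulus."""
    if abs(sigma.imag) > tol:
        return -abs(sigma)
    return sigma.real
```

For a trial mass of 1, an event with $l_T = 0.3$ keeps its real, positive value, while an event with $l_T = 0.6$ (beyond the edge at $l_T = 0.5$) is mapped to a negative value and so lands on the $\sigma_A < 0$ side of the histogram.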



IX. CORRELATIONS 



It is clear that the transverse mass (or its equivalent,
$\Sigma_T$ of Eq. (18)) and the SV of Eq. (33) are highly
correlated. They both vanish at the singularity as
$\mathcal{M} - 2\, l_T$. To illustrate the point, define the variable



(35) 



which has the same mass dimensionality as $\Sigma_A$ and, close
to the singularity, carries the same information as $\Sigma_T$.
The double histogram of the two variables, shown in Fig. 8,
illustrates the expected correlation.
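The linear vanishing of the optimal SV at the singularity can be made explicit. The following is a sketch under the assumption that $\Sigma_A$ is defined through Eqs. (31) and (32) at $p_T = 0$ with $l_T > 0$ and $\mathcal{M} = M$; the shorthand $\delta$ is introduced only for this illustration.

```latex
% On the singularity line l_T = M/2 the two roots of Phi = 0 coincide,
% x_3 = l_3, so that N_2 = N_3 = 0 and n_1 = -1 there.
% Slightly below the edge, x_3 - l_3 = O(sqrt(delta)), hence N_2, N_3 = O(sqrt(delta)).
\begin{align*}
\Sigma_A &= \frac{l_T - M/2}{n_1}
          = \delta\,\frac{|\mathbf{N}|}{|N_1|}
          = \delta\,\sqrt{1 + \frac{N_2^2 + N_3^2}{N_1^2}},
          \qquad \delta \equiv \frac{M}{2} - l_T , \\
N_2^2 + N_3^2 &= \mathcal{O}(\delta)
  \quad\Longrightarrow\quad
  \Sigma_A = \delta\,\bigl[1 + \mathcal{O}(\delta)\bigr]
           = \tfrac{1}{2}\,(M - 2\,l_T)\,\bigl[1 + \mathcal{O}(\delta)\bigr].
\end{align*}
```

To leading order the SV therefore measures the same distance to the edge, $M - 2\, l_T$, that the transverse variables do, which is the origin of the correlation.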

Naturally, correlations between observables constitute 
a weakness of their ensemble, to which we shall come 
back in the conclusions. Suffice it to say here that in the 
"signal only" case at hand, there is only one mass scale to 
extract from the data: the correlations are unavoidable. 



different values of $\mathcal{M}$. In all cases $p_T = 0$.

Figure 8: The correlation between the SV of Eq. (33) and the
SC expressed as the SV of Eq. (35), for $\mathcal{M} = M$.



X. THE GENERAL CASE 



In Figs. (1, 2) we have profited from the fact that the
$p_T = 0$ phase space of Eq. (9) is a function of $l_T^2$ to plot
the phase space for negative and positive $l_T$. For $p_T \neq 0$
this is no longer possible. Let $l_T$ and $p_T$ be the moduli
of the corresponding vectors and $\theta$ be the angle between
them. The general case phase space is then:

$$\Phi(l_3, x_3, l_T, \cos\theta, p_T, M) = \left[-2\, l_T \left(\cos\theta\, p_T + l_T\right) + 2\, l_3 x_3 + M^2\right]^2 - 4 \left(l_3^2 + l_T^2\right) \left(2 \cos\theta\, l_T\, p_T + l_T^2 + p_T^2 + x_3^2\right) = 0 \tag{36}$$



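for which the generalization of the $p_T = 0$ result of
Eq. (10) is

$$x_3(M, l_T, l_3, \cos\theta, p_T) = \frac{\left[M^2 - 2\, l_T \left(l_T + p_T \cos\theta\right)\right] l_3 \pm \sqrt{\left(l_3^2 + l_T^2\right)\left[M^4 - 4 M^2\, l_T \left(l_T + p_T \cos\theta\right) - 4\, l_T^2\, p_T^2 \sin^2\theta\right]}}{2\, l_T^2} \tag{37}$$

and that of $|l_T| < M/2$ is

$$l_T^{\max}(M, \cos\theta, p_T) = \frac{M^2}{2\left[\sqrt{M^2 + p_T^2} + p_T \cos\theta\right]} \tag{38}$$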
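Eqs. (36) to (38) lend themselves to a quick numerical cross-check. The sketch below is illustrative only and rests on stated assumptions: the sign convention in which the neutrino transverse momentum is $-(\vec{l}_T + \vec{p}_T)$ (inferred from the $+2\cos\theta\, l_T\, p_T$ term of Eq. (36)), massless leptons, and roots of $x_3$ obtained by solving the quadratic $\Phi = 0$ directly. All function names are hypothetical.

```python
import math


def phi_general(l_3, x_3, l_t, cos_theta, p_t, m):
    """General phase space function, argument order as in Eq. (36)."""
    a = -2.0 * l_t * (cos_theta * p_t + l_t) + 2.0 * l_3 * x_3 + m**2
    e_nu_sq = 2.0 * cos_theta * l_t * p_t + l_t**2 + p_t**2 + x_3**2
    return a**2 - 4.0 * (l_3**2 + l_t**2) * e_nu_sq


def invariant_mass(l_3, x_3, l_t, cos_theta, p_t):
    """Lepton-neutrino mass, assuming nu_T = -(l_T + p_T) and massless leptons."""
    sin_theta = math.sqrt(1.0 - cos_theta**2)
    lep = (l_t, 0.0, l_3)
    nu = (-(l_t + p_t * cos_theta), -p_t * sin_theta, x_3)
    energy = math.sqrt(sum(c * c for c in lep)) + math.sqrt(sum(c * c for c in nu))
    p_sq = sum((a + b) ** 2 for a, b in zip(lep, nu))
    return math.sqrt(energy**2 - p_sq)


def edge_bracket(l_t, cos_theta, p_t, m):
    """Bracket under the square root of x_3; it vanishes at l_t = l_t_max."""
    sin_sq = 1.0 - cos_theta**2
    return (m**4 - 4.0 * m**2 * l_t * (l_t + p_t * cos_theta)
            - 4.0 * l_t**2 * p_t**2 * sin_sq)


def x3_roots_general(l_3, l_t, cos_theta, p_t, m):
    """Both roots of phi_general = 0 in x_3 (real inside the edge only)."""
    bracket = edge_bracket(l_t, cos_theta, p_t, m)
    if bracket < 0.0:
        raise ValueError("complex x_3: l_T beyond the edge")
    base = (m**2 - 2.0 * l_t * (l_t + p_t * cos_theta)) * l_3
    step = math.sqrt((l_3**2 + l_t**2) * bracket)
    return (base + step) / (2.0 * l_t**2), (base - step) / (2.0 * l_t**2)


def l_t_max(m, cos_theta, p_t):
    """Largest transverse lepton momentum; reduces to m / 2 at p_T = 0."""
    return m**2 / (2.0 * (math.sqrt(m**2 + p_t**2) + p_t * cos_theta))
```

Building an event from explicit momenta, computing its invariant mass, and feeding that mass back into `phi_general` should return zero up to rounding, which is a direct check of the reconstructed form of Eq. (36) under the stated convention.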



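The statistically optimal $\Sigma_A$ is computed exactly as in
the previous section, with the result:

$$\Sigma_A(l_T, l_3, \cos\theta, p_T, M) = \frac{l_T - l_T^{\max}(M)}{n_1(M)} \tag{39}$$

where $n_1$ is computed as in Eq. (31) in terms of the phase
space function of Eq. (36). More explicitly:

$$\begin{aligned}
N_1 &= -4\left[\, p_T \cos\theta \left(2\, l_3 W_3 + M^2\right) + 2\, l_T \left(M^2 + p_T^2 \sin^2\theta + W_3^2\right) \right] \\
N_2 &= 4 M^2 \left(W_3 - l_3\right) - 8\, l_3\, p_T^2 - 8\, l_T^2 W_3 - 8\, l_T \left(l_3 + W_3\right) p_T \cos\theta \\
N_3 &= 4\, l_3 \left(M^2 - 2\, l_T\, p_T \cos\theta\right) - 8\, l_T^2 W_3
\end{aligned} \tag{40}$$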
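The components of Eq. (40) can be checked against numerical derivatives of the phase space function, and Eq. (39) then follows in one line. The sketch below is a hedged illustration rather than the authors' code: it assumes the form of $\Phi$ in Eq. (36), takes $N_2$ as the analytic derivative $\partial\Phi/\partial l_3$, and uses hypothetical function names throughout.

```python
import math


def phi_general(l_3, x_3, l_t, cos_theta, p_t, m):
    """General phase space function, argument order as in Eq. (36)."""
    a = -2.0 * l_t * (cos_theta * p_t + l_t) + 2.0 * l_3 * x_3 + m**2
    e_nu_sq = 2.0 * cos_theta * l_t * p_t + l_t**2 + p_t**2 + x_3**2
    return a**2 - 4.0 * (l_3**2 + l_t**2) * e_nu_sq


def l_t_max(m, cos_theta, p_t):
    """Edge of the transverse lepton momentum, as in Eq. (38)."""
    return m**2 / (2.0 * (math.sqrt(m**2 + p_t**2) + p_t * cos_theta))


def gradient_general(l_3, x_3, l_t, cos_theta, p_t, m):
    """(N_1, N_2, N_3) = (dPhi/dl_T, dPhi/dl_3, dPhi/dx_3) in closed form."""
    w_3 = l_3 + x_3
    sin_sq = 1.0 - cos_theta**2
    n_1 = -4.0 * (p_t * cos_theta * (2.0 * l_3 * w_3 + m**2)
                  + 2.0 * l_t * (m**2 + p_t**2 * sin_sq + w_3**2))
    n_2 = (4.0 * m**2 * (w_3 - l_3) - 8.0 * l_3 * p_t**2
           - 8.0 * l_t**2 * w_3 - 8.0 * l_t * (l_3 + w_3) * p_t * cos_theta)
    n_3 = 4.0 * l_3 * (m**2 - 2.0 * l_t * p_t * cos_theta) - 8.0 * l_t**2 * w_3
    return n_1, n_2, n_3


def sigma_a_general(l_3, x_3, l_t, cos_theta, p_t, m):
    """Sigma_A = (l_T - l_T_max) / n_1 with n_1 = N_1 / |N|, as in Eq. (39)."""
    n_1, n_2, n_3 = gradient_general(l_3, x_3, l_t, cos_theta, p_t, m)
    norm = math.sqrt(n_1**2 + n_2**2 + n_3**2)
    return (l_t - l_t_max(m, cos_theta, p_t)) * norm / n_1
```

Setting `p_t = 0.0` reduces `gradient_general` to the $p_T = 0$ gradient used in the previous section, so the same routine covers both cases.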



Some examples of the general phase space surface are
given in Fig. 9.

XI. CONCLUSIONS AND OUTLOOK

We have studied in detail the phase space of the sim- 
plest interesting hadron collider process involving an un- 
observable particle and only one mass to be determined. 
Naturally, the crucial ingredients are the phase space pro- 
jections onto the observable momenta, their limits, and 
the distances of actual events from these limits. 

The edge of the projected phase space is given by
the formal singularity condition, Eq. (12), which can be
re-expressed as a function of the observable momenta,
Eq. (14), and coincides with the consuetudinary transverse
mass function, Eq. (1).

The "singularity variables" are various measures of the
distance of an actual event to the nearest edge singularity.
We have determined in §VI the measure for which the
SV is statistically optimal, which we called the "statistical
squared derivative" and which turns out to be well known to
statisticians as the "Fisher information" [9]. The actual
result ought to have been obvious for starters: the optimal
variable ($\Sigma_A$ in Eqs. (33, 39)) is orthogonal to the
phase space at all points and is thereby most sensitive to
the unknown mass, which determines the overall scale of
momenta.

Somewhat unexpectedly, singularity variables other
than the optimal one develop fake singularities away from
the edge singularity at $\sigma = 0$, see Fig. 5, top. The W's
natural width suffices to merge the edge and fake singularities,
resulting in a peak at $\sigma > 0$, see Fig. 5, bottom.
This is a potential complication in their use as tools to
determine the unknown mass(es).

Contrary to the SCs, the SVs depend on longitudinal
momenta. In the case of single-W production, whether or
not they may add significant precision to a measurement
of the W mass depends on the prior level of understanding
of the relevant pdfs [5], a question that we have not
tried to investigate. It may well turn out, contrariwise,
that the optimal SV, with a value of $\mathcal{M}$ determined by
the transverse observables, is a good tool to constrain the
pdfs.

The SVs contain the SC as a factor. This makes them
"weak", in that they are highly correlated to the information
contained in the SC, as discussed in §IX. The SVs
are functions of an auxiliary mass $\mathcal{M}$, and of transverse
and longitudinal momenta. Varying $\mathcal{M}$ as in the lower
panel of Fig. 7 is an efficient way to "focus" on the relevant mass
scale, particularly for cases with more than one unknown
mass [7]. But it does not add to the precision with which
the mass(es) may be measured.

Whether or not the various and rather negative con- 
clusions of the previous two paragraphs apply to cases 
wherein more than one particle decays into invisible ones 
is a question that we plan to discuss in subsequent work. 
The answer requires a detailed study of the relevant phase 
space, akin to the one in this note. 
Figure 9: The general phase space of Eq. (36) for $M = 1$ and
$p_T = 1$. Top, Center, Bottom are for $\cos\theta = -1, 0, 1$.